08. Attention Encoder & Decoder

In machine translation applications, the encoder and decoder are typically...

SOLUTION: Recurrent Neural Networks (typically a vanilla RNN, LSTM, or GRU)
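
A minimal sketch of this setup, assuming PyTorch (the module names, dimensions, and hyperparameters below are illustrative, not from the lesson):

```python
import torch.nn as nn

class Encoder(nn.Module):
    """Reads the source sentence and produces one hidden state per token."""
    def __init__(self, vocab_size, embed_dim=200, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, src_ids):                  # src_ids: (batch, src_len)
        embedded = self.embedding(src_ids)       # (batch, src_len, embed_dim)
        outputs, hidden = self.rnn(embedded)     # outputs: (batch, src_len, hidden_dim)
        return outputs, hidden

class Decoder(nn.Module):
    """Generates the target sentence one token at a time."""
    def __init__(self, vocab_size, embed_dim=200, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt_ids, hidden):          # one step: tgt_ids is (batch, 1)
        embedded = self.embedding(tgt_ids)       # (batch, 1, embed_dim)
        output, hidden = self.rnn(embedded, hidden)
        return self.out(output), hidden          # logits over the target vocab
```

An LSTM variant would swap `nn.GRU` for `nn.LSTM` and carry a (hidden, cell) tuple instead of a single hidden state.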

Word Embeddings

What's a more reasonable embedding size for a real-world application?

SOLUTION: 200 (toy examples use tiny embeddings for readability, but real-world systems typically use on the order of 100-300 dimensions; pretrained GloVe vectors, for instance, are distributed in 50, 100, 200, and 300 dimensions)
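
As a quick illustration, assuming PyTorch (the vocabulary size and token ids here are hypothetical), an embedding layer of size 200 maps each token id to a 200-dimensional vector:

```python
import torch
import torch.nn as nn

vocab_size = 30000                         # hypothetical vocabulary size
embedding = nn.Embedding(vocab_size, embedding_dim=200)

token_ids = torch.tensor([[12, 455, 7]])   # a batch of one 3-token sentence
vectors = embedding(token_ids)
print(vectors.shape)                       # torch.Size([1, 3, 200])
```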

Which time steps require calculating an attention vector in a seq2seq model with attention?

SOLUTION: Every time step in the decoder only. The decoder computes a fresh attention vector for each output token it generates; the encoder runs without attention.
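
A sketch of that per-step computation, assuming PyTorch and simple dot-product attention (the function name and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def attention_context(decoder_hidden, encoder_outputs):
    """Dot-product attention for a single decoder time step.

    decoder_hidden:  (batch, hidden_dim)          current decoder state
    encoder_outputs: (batch, src_len, hidden_dim) all encoder states
    """
    # One score per source position: (batch, src_len)
    scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2)).squeeze(2)
    weights = F.softmax(scores, dim=1)            # attention distribution
    # Weighted sum of encoder states: (batch, hidden_dim)
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
    return context, weights
```

The decoder would call a function like this once per output token, combining the returned context vector with its own hidden state before predicting the next word; the encoder never calls it.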